[57] ABSTRACT

A method and a structure to implement maximum- 
likelihood decoding of convolutional codes on a net- 
work of microprocessors interconnected as an n-dimen- 
sional cube (hypercube). By proper reordering of states 
in the decoder, only communication between adjacent 
processors is required. Faster and more efficient opera- 
tion is enabled, and decoding of large constraint length 
codes is feasible using standard VLSI technology. 

11 Claims, 12 Drawing Figures 
METHOD AND APPARATUS FOR
IMPLEMENTING A MAXIMUM-LIKELIHOOD 
DECODER IN A HYPERCUBE NETWORK 

BACKGROUND OF THE INVENTION

1. Origin of the Invention

The invention described herein was made in the performance
of work under a NASA Contract and is subject
to the provisions of Public Law 96-517 (35 USC
202) in which the contractor has elected to retain title.

2. Field of the Invention 

The present invention is concerned with a method for 
maximum-likelihood decoding of convolutional codes 
on a network of microprocessors, and apparatus for
executing this method. 

3. Brief Description of the Prior Art 

A concurrent computing system in which individual
computers, each with a computational processor and a
message-handling processor, are connected as a hypercube
is described and claimed in an application entitled
“Concurrent Computing System Through Asynchronous
Communication Channels,” Ser. No. 754,828, filed
on July 12, 1985 and assigned to California Institute of
Technology. As there described, N nodes (numbered 0,
1, ..., N-1) are connected together in a binary (or Boolean)
n-dimensional cube in which $N = 2^n$ or $n = \log_2 N$.
The above-identified application depicts one representative
hypercube connection of a network of processors
in which the processors are located at the vertices of an
n-dimensional cube and communicate with each other
by bidirectional communication links only along the
edges of the cube. One manner of transmitting data and
representative examples of microprocessor hardware
and software suitable for performing the data transfer
feature of this invention are fully disclosed in the
above-identified application.

Hypercube-connected multiprocessors having up to
128 nodes (n=7) are known, and will soon be extended
to 1,024 nodes (n=10) and higher. As the number of
nodes increases, it is imperative that the number of
connections per decoder at each node be kept to a minimum.
Otherwise, the existing pin-limitation constraint
of VLSI decoders cannot be accommodated.

Maximum-likelihood decoding of convolutional
codes is also well known. The Viterbi decoding algorithm
is commonly used for such decoding, and many
textbooks such as “Theory and Practice of Error Control
Codes” by Richard E. Blahut, Addison-Wesley
Publishing Company, Inc. Copyright 1983 describe the
encoding/decoding of convolutional codes. A Viterbi
decoding algorithm is conceptualized in the text and a
succinct summary of a trellis diagram, states, stages and
common steps to obtain accumulated metrics and survivors
is described at pages 350 through 353 and 377
through 382 of that text.

Convolutional codes are widely used in digital communication
systems to decrease the probability of error
on a given noisy channel (coding gain). They are characterized
by the constraint length K and by the code
rate given by the ratio $k_0/n_0$, where $n_0$ channel symbols
are generated by the encoder for each $k_0$ input data bits.
Details can be found, for example, in the above-noted
“Theory and Practice of Error Control Codes,” by R.
E. Blahut.

Convolutional codes can achieve large coding gain if
they use large memory m, or equivalently, large constraint
length $K = m + k_0$. An encoder for such codes is
a finite-state machine with $2^m$ states. The complexity of
a maximum-likelihood decoder is approximately proportional
to the number of states, i.e., it grows exponentially
with m.

The task of the decoder is to consider all possible 
paths on a trellis of about 5m stages, and find the most 
likely path, according to a specified goodness criterion. 
The goodness criterion is described in the article “The 
Viterbi Algorithm,” by G. D. Forney, Proc. IEEE, 
Vol. 61 (1973), pp. 263-278. 

SUMMARY OF THE INVENTION 

Multiprocessor systems have the potential to obtain 
large computation power. This is possible if one can 
solve the problem of how to decompose the decoding 
algorithm. There are two key requirements in the prob- 
lem decomposition: (1) divide the algorithm in equal 
parts, in order to share equally the resources available in 
each processor, and (2) minimize the communication 
between the parts, so that each processor needs to share 
information only with nearest neighbors. 

Since a single microprocessor or VLSI chip cannot 
accommodate all the functions required to implement 
complex decoders, methods must be found to efficiently 
use a network of processors. Parallel decoding is accomplished
by a network of processors placed at the vertices
of an n-dimensional cube and connected along its edges.
The paths to be examined and stored by each processor 
are shared only with neighboring processors on the 
cube. Each neighbor is interrogated to find the good- 
ness of the updated paths and therefore decide which 
ones should be stored. 

The most basic embodiment assigns each available
processor to a single state of the decoder. The
proper reordering of paths and states is obtained by
making a given processor x act as state $\hat{x}$, where $\hat{x}$ is a
function of x and the stage of the trellis, based on cyclic
shifts of the binary label representing x.

The method also provides a way to group sets of
$S = 2^s$ states into each processor while still requiring only
communication between neighboring processors. This
arrangement yields high computational efficiency for
complex codes.

FEATURES OF THE INVENTION 

It is a feature of the invention to provide a method 
and apparatus for maximum-likelihood decoding of 
convolutional codes on a network of microprocessors 
interconnected as an n-dimensional cube and having each
processor compute part of all paths of a trellis diagram 
in parallel with concurrent computations by other pro- 
cessors in the cube. This feature results in high effi- 
ciency, in terms of decoding speed, and allows use of 
codes more complex than those presently decodable 
with known methods. 

In accordance with the present invention, a method is 
provided for decoding convolutional codes, by the 
accumulated metric and survivor steps of the known 
Viterbi algorithm, but with the novel and unique differ- 
ence that decoding operations are performed in parallel 
by a network of processors placed at the vertices of an 
n-dimensional cube, and bidirectionally communicating 
with neighboring processors only along the edges of the 
cube. This decoding method can operate with high 
efficiency because the parallel decoding operations 
concerning each state of the decoder are performed in a 
suitable and novel order within each processor. 
Further provided is an arrangement for implementing
this novel method, comprising a decoder structure
which is characterized by:
a network of $N = 2^n$ processors interconnected as an
n-dimensional cube, having bidirectional communication
along the edges of the cube for receiving/transmitting
to neighboring processors accumulated
metrics and hypothesized data sequences
(survivors), to internally store said quantities, and
to perform comparisons between accumulated metrics;

input means for sending the received channel symbol
sequence to all processors in the network; and
output means for delivering the decoded, most likely,
data sequence to a user.

The arrangement can accommodate different numbers
$S = 2^s$ of the $M = 2^m$ decoder states into each processor,
depending on the code complexity and the number
of available processors $N = 2^n$, where $M = S \times N$.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a trellis diagram of the Viterbi algorithm,
showing all the possible state transitions for an 8-state
decoder ($M = 2^m = 8$, m=3). Only four stages are
shown, but the trellis can be extended as necessary;

FIG. 2 is the hypercube trellis diagram of this invention
showing transitions between processors $P_i$ of the
hypercube and the states represented by each processor.
The trellis can be extended by repeating the m stages
shown, to form a periodic structure;

FIG. 3 is an example of an n-dimensional cube (n=3)
network of microprocessors placed at the vertices of
the cube, where solid lines show connections used in a
given stage;

FIG. 4 is a specific decoder structure where two
states are represented by each of the four processors;

FIG. 5 is a block diagram of the internal arrangement 
of each processor in the network; 

FIG. 6, including FIGS. 6a, 6b, and 6c, contains the
arrangements for decoders with M=8 states and S=1,
2, or 4 states per processor, respectively;

FIG. 7a is a decoder structure on a two-dimensional
cube for an 8-state (m=3) code of rate $k_0/n_0$, where $k_0 = 2$;

FIG. 7b is a decoder trellis for an 8-state (m=3) code
of rate $k_0/n_0$, where $k_0 = 2$;

FIG. 8a is a trellis diagram of a four state decoder 
showing how paths are eliminated according to the 
usual Viterbi algorithm, and 
FIG. 8b is a trellis diagram for four states showing 
how paths are eliminated during the decoding operation 
of this invention. 

DETAILED DESCRIPTION OF THE 
DRAWINGS 

Convolutional codes are well known. Such codes can
achieve large coding gain, i.e., improvement on the
overall performance of a digital communication link, if
they use large memory m, or equivalently, large constraint
length $K = m + 1$ (in the simple case when $k_0 = 1$).
An encoder for such codes is a finite-state machine with
$2^m$ states. Decoders for convolutional codes can be
effectively described in terms of a graph called a trellis
diagram, as shown in FIG. 1 for an m=3, 8-state code.

The left-hand column of FIG. 1 shows all of the 
possibilities for a three-bit grouping in a finite state 
coding machine of three shift register cells, or memory 
m=3. In keeping with the Viterbi decoding algorithm, 
a decoder would examine all eight possibilities by using
an iterative trial. Each iteration corresponds to a separate
vertical column such as iteration No. 1, No. 2, etc.,
through 15. The number of iterations in a time se- 
quence, with time being depicted horizontally, is shown 
in FIG. 1 for 15 iterations. It is a rule of thumb that 
15 = 5m iterations will almost always yield the proper 
result. Obviously more iterations can be tried, but the 
extra number of iterations normally does not achieve 
any significantly-improved results. 

The task of the decoder is to consider all possible 
paths on a trellis of about 5m stages, and find the most 
likely path, according to a specified goodness criterion. 
The goodness criterion is simply a number of merit 
which will be explained in more detail hereinafter for a 
typical simplified example of FIG. 8. Suffice it to say at 
this point that a received sequence of symbols is viewed 
by the decoder as simply a trial of hypothesized sequen- 
ces. As each new received symbol is considered, the 
weighted values of trellis paths are reviewed by the 
decoding algorithm and the lowest number (indicating 
the highest level of goodness) is temporarily chosen as 
the best possible candidate to match the one that was 
originally encoded. 

The trellis diagram 10 in FIG. 1 is used in the Viterbi 
algorithm to represent all possible transitions between 
states as described in the above-mentioned articles. In 
particular, at each stage all eight states are examined 
and only one of the two possible paths 11, 12 coming 
into a state (such as 000) is preserved, along with its 
goodness or likelihood measure, called an accumulated 
metric. At a given time (stage), such as time $T_1$, each
state is associated with one (and only one) path (path 12, 
for example) reaching it. The algorithm performs the 
following steps: 

(1) Update the value of the accumulated metric of the
two paths converging into each state, according to a
known rule. This known rule, described in detail in the
above-referenced paper by Forney, consists, in summary,
of adding the so-called branch metric, which
depends on the received symbol, to the accumulated
metrics of the paths converging into each state;

(2) Choose the preferred path between those two, 
according to their metrics; and 

(3) Store the chosen path and its metric. 
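
The three steps above can be sketched in Python as one add-compare-select pass over all states. This is only an illustrative sketch, not the patented implementation: the function name `acs_stage` and the `predecessors` and `branch_metric` callables are hypothetical, and the sketch assumes, as stated earlier for the goodness criterion, that the lowest number indicates the best path.

```python
from typing import Callable, List, Sequence, Tuple


def acs_stage(
    metrics: Sequence[float],
    survivors: Sequence[List[int]],
    predecessors: Callable[[int], Sequence[Tuple[int, int]]],
    branch_metric: Callable[[int, int], float],
) -> Tuple[List[float], List[List[int]]]:
    """Run one trellis stage (update, choose, store) for every state.

    predecessors(state) returns (previous_state, input_bit) pairs and
    branch_metric(previous_state, state) returns the cost of that branch
    for the symbol just received. Both are hypothetical helpers.
    A lower accumulated metric is treated as a better path.
    """
    new_metrics: List[float] = []
    new_survivors: List[List[int]] = []
    for state in range(len(metrics)):
        # Step 1: add the branch metric to each converging path's metric.
        candidates = [
            (metrics[prev] + branch_metric(prev, state), prev, bit)
            for prev, bit in predecessors(state)
        ]
        # Step 2: choose the preferred (lowest-metric) path.
        best_metric, best_prev, best_bit = min(candidates)
        # Step 3: store the chosen path and its metric.
        new_metrics.append(best_metric)
        new_survivors.append(survivors[best_prev] + [best_bit])
    return new_metrics, new_survivors
```

On a single processor this loop runs over all $2^m$ states at every stage, which is the sequential cost that the parallel arrangement is meant to divide.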

Before describing the trellis for the processor/decoder
method and apparatus of this invention, a brief
review of the simplified diagram of FIG. 3 is believed
helpful. The earlier-identified application discloses a
complete concurrent processing system in which bidirectional
communication links transmit data along the
edges of a cube. For simplicity's sake each node, or
processor, is shown simply as a dot. In FIG. 3, the
bidirectional communication links are shown as solid
lines with double-headed arrows $30_x$ through $33_x$, $35_z$
through $38_z$ and $40_y$ through $43_y$, respectively. The x, y
and z indicate directions of communication in the cube.
Associated with the corner locations of the cube network
are binary labels which identify the processors.
For example, processor $P_0$ is identified by the tribit
group 000, $P_1$ by the tribit group 001, etc. Each processor
$P_{000}$ ($P_0$) through $P_{111}$ ($P_7$) is connected only to its n
neighbors. It is desirable to use direct communication
links between processors, in order to speed up communication.
Bidirectional link $30_x$ delivers bit sequences
back and forth between processors $P_0$ and $P_1$ in any
well-known manner. Likewise, as shown, link $31_x$ is
connected between $P_4$ and $P_5$, link $32_x$ is connected
between $P_6$ and $P_7$, and link $33_x$ is connected between
$P_2$ and $P_3$. In the second stage of FIG. 3, bidirectional
link $35_z$ is connected between processors $P_0$ and $P_2$, link
$36_z$ is connected between $P_4$ and $P_6$, etc. as is there
depicted.

Since complex codes have a very large number of 
states, it becomes impossible to perform sequentially all 
the above steps in reasonable time with a single proces- 
sor. It is therefore desirable to share the work-load 
among many processors working in parallel. A conve- 
nient topology for a network of processors is the n- 
dimensional cube structure described above. Processors 
must be able to communicate among themselves in 
order to exchange intermediate results. Therefore, the 
decoding process will include some time spent in com- 
putations internal to each processor, and some time for 
interprocessor communication. It is this latter commu- 
nication time which must be kept as small as possible to 
yield high decoding efficiency. 

The advantage which can be achieved by the inven- 
tion is the ability to use large constraint length convolu- 
tional codes, which yield high coding gain and to keep 
acceptable decoding speed with feasible hardware. 

This is due to the fact that the efficiency of the
method, given by

$$\eta = \frac{N_c t_c}{N_c t_c + N_t t_t} = \frac{\text{sequential alg. time}}{N \times (\text{parallel alg. time})},$$

where $N_c$ is the number of parallel metric comparisons,
$t_c$ is the comparison time, $N_t$ is the number of parallel
survivor exchanges, and $t_t$ is the exchange time, remains
high even when a large number of processors are used.

While $t_c$ and $t_t$ depend on the hardware technology
used for the processors, the method yields

$$N_t = S(m - s) = Mn/N$$

$$N_c = Sm = Mm/N,$$

so that the efficiency is never below the ratio

$$\frac{t_c}{t_c + t_t},$$

which is reached when N=M.
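
The efficiency expression can be evaluated numerically with a short Python sketch. It takes the number of parallel metric comparisons as $Sm$ and the number of parallel survivor exchanges as $S(m-s)$, with $s = m - n$ following from $M = S \times N$. The function names and the timing values used in the usage note are hypothetical and chosen only for illustration.

```python
def efficiency(m: int, n: int, t_c: float, t_t: float) -> float:
    """Efficiency N_c*t_c / (N_c*t_c + N_t*t_t) for M = 2**m states
    decoded on N = 2**n processors.

    t_c is the comparison time and t_t the exchange time, in any
    common unit (hypothetical values, hardware dependent).
    """
    if not 0 <= n <= m:
        raise ValueError("need 0 <= n <= m")
    s = m - n              # log2 of the states per processor
    S = 2 ** s             # states per processor
    n_c = S * m            # parallel metric comparisons
    n_t = S * (m - s)      # parallel survivor exchanges
    return (n_c * t_c) / (n_c * t_c + n_t * t_t)


def efficiency_floor(t_c: float, t_t: float) -> float:
    """The ratio t_c / (t_c + t_t), reached when N = M."""
    return t_c / (t_c + t_t)
```

For example, with `t_c = t_t = 1.0` and m=3, `efficiency(3, 3, 1.0, 1.0)` equals the floor of 0.5, while grouping two states per processor (`n = 2`) gives 6/10.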

We may assign each state of FIG. 1 to a different processor,
so that all operations concerning all states can be
done simultaneously (in parallel) at each stage, and
sequentially stage by stage. Upon examination, however,
I discovered that if we assign state 0 to processor
$P_0$, state 1 to processor $P_1$, and so on up to state N-1 to
processor $P_{N-1}$ ($P_7$) and we consider processors connected
as in FIG. 3, then links between processors
which are not directly connected in FIG. 3 would be
necessary to implement all links between the states or
processors in FIG. 1. My novel solution included mapping
the states in the trellis 10 of FIG. 1 as the hypercube
trellis 20 of FIG. 2.

According to the principles of my invention, a given
processor is not assigned to a fixed state, as was the case
in FIG. 1. Instead, for my invention the processors are
identified by a binary label ($P_0 = P_{000}$, $P_1 = P_{001}$, etc.) as
shown in FIG. 2, and the trellis labelling and stage
order is uniquely defined by the following formula. In
particular, processor x represents state $\hat{x}$ at stage k, if

$$\hat{x} = \rho^{(k)}(x),$$

where $\rho^{(k)}(\cdot)$ is the cyclic right shift of x by k binary
positions. A path through given states in FIG. 1 is thus
represented by a specified equivalent path in FIG. 2,
which passes through the same states. This means that
there is a well-defined correspondence between paths in
FIGS. 1 and 2. According to this correspondence, a
Viterbi-type algorithm, based on the trellis of FIG. 1,
can be performed on the hypercube trellis of FIG. 2.
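
The effect of the cyclic-shift relabelling can be checked with a small Python sketch. The helper names are hypothetical. The trellis convention is also an assumption made here, since the text does not fix a bit ordering: the new data bit is taken to enter at the most significant position while the register shifts right. Under the opposite convention a cyclic left shift would play the same role.

```python
def cyclic_right_shift(x: int, k: int, m: int) -> int:
    """Rotate the m-bit label x right by k positions (k may be negative)."""
    k %= m
    mask = (1 << m) - 1
    return ((x >> k) | (x << (m - k))) & mask


def state_of(processor: int, stage: int, m: int) -> int:
    """State represented by a processor at a given stage."""
    return cyclic_right_shift(processor, stage, m)


def processor_of(state: int, stage: int, m: int) -> int:
    """Inverse mapping: rotating left by `stage` undoes the right shift."""
    return cyclic_right_shift(state, -stage, m)


def predecessor_states(state: int, m: int):
    """ASSUMED convention: next = (prev >> 1) | (bit << (m - 1))."""
    mask = (1 << m) - 1
    base = (state << 1) & mask
    return [base, base | 1]


def links_used(stage: int, m: int):
    """Processor pairs that must exchange data to go from `stage` to `stage + 1`."""
    links = set()
    for proc in range(1 << m):
        state = state_of(proc, stage + 1, m)
        for prev in predecessor_states(state, m):
            src = processor_of(prev, stage, m)
            if src != proc:
                links.add((min(src, proc), max(src, proc)))
    return links
```

For m=3, `links_used(0, 3)` returns the pairs (0,1), (2,3), (4,5), (6,7), and each later stage uses the next dimension of the cube, so every exchange is between processors whose labels differ in exactly one bit.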

The interesting and useful property of the trellis diagram
of FIG. 2 is that all the required links between
processors are now exactly those available in the hypercube
network of FIG. 3. The first stage labeled as such
in FIG. 3 shows how the first stage of FIG. 2 can be
performed by using the connections $30_x$ through $33_x$
(marked with solid lines and double-headed arrows)
between the processor pairs $P_0$ and $P_1$, $P_2$ and $P_3$,
$P_4$ and $P_5$, and $P_6$ and $P_7$. Similarly, the second and third stages
of FIG. 3 show the bidirectional communication links
required for implementation of the second and third
stages of FIG. 2. It should be understood that the embodiment
of the decoder on the network has been explained
for the case m=3, but it clearly can be generalized
to other decoder sizes and hypercube dimensionality.

When the number of states is larger than the available
or feasible number of processors, it becomes necessary
to assign more than one state per processor. This can be
done as shown in FIG. 4, where S=2 states are assigned
to each processor. The required interprocessor communication
links are provided by a two-cube-connected
processor system which requires two decoding operations
within each state. The method and apparatus of
this invention thus generalizes to a number of states per
processor which is a power of two, i.e., $S = 2^s$, where S
is the number of states per processor.

The simplest embodiment of the invention is shown in
FIG. 5 for the case of one state per processor, S=1. The
block diagram represents the arrangement used in each
processor of the hypercube, where the input and output
devices 60 and 65 sequentially connect to neighbors
along each dimension of the cube, one dimension at a
time, as shown by stages 1, 2 and 3 of FIG. 3. The block
diagram of FIG. 5 may be thought of as a particular
means for performing the several desired functions.
FIG. 5 is a timed operation which is readily performable
by any well-known and available processor, and to
this extent FIG. 5 may be thought of as a flow diagram
for the various computations.

Although not depicted, it should be understood that
all processors are initialized to the same initial state.
Operation of the decoder requires that blocks of initializing
data be loaded in every processor by an operation
called “broadcasting.” In the hypercube network of
processors under consideration, data from a host processor
is directly exchanged only through node zero
(the origin of the cube). An efficient concurrent method
is required to broadcast a message from node zero to all
other nodes. Since the diameter D of an n-cube is n, a
lower bound on the broadcasting time is n time units
(where the unit is the time to send a message to a neighbor).

Assume that a message is in node zero, at time zero.
In each subsequent time slot $t_k$ send messages in parallel
from each node

$$x = [x_{m-1}, \ldots, x_{k+1}, 0, x_{k-1}, \ldots, x_0]$$

to each node

$$x' = [x_{m-1}, \ldots, x_{k+1}, 1, x_{k-1}, \ldots, x_0],$$

the neighbors along dimension k. After n time units, the
message has propagated to all nodes.

Even though this method does not minimize the num- 
ber of communications (with the advantage of a very 
simple indexing), it optimizes the total broadcasting 
time to n time units. The result is clearly optimum, since 
it achieves the lower bound. 
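
The broadcasting schedule can be simulated with the following Python sketch. It is a minimal model under stated assumptions: the function name `broadcast` is hypothetical, time slots are numbered from 0, and a node that does not yet hold the message still "sends" along its scheduled link, which is how the simple indexing trades extra transmissions for a fixed schedule.

```python
def broadcast(n: int, message: str):
    """Simulate the n-slot broadcast from node zero on an n-cube.

    Returns (holdings, transmissions): what each node holds after
    n time slots, and how many link transmissions were scheduled.
    """
    holdings = [None] * (1 << n)
    holdings[0] = message              # message starts in node zero
    transmissions = 0
    for k in range(n):                 # slot k uses dimension k
        snapshot = list(holdings)      # all sends in a slot are parallel
        for x in range(1 << n):
            if not (x >> k) & 1:       # bit k of x is 0: x is a sender
                neighbor = x | (1 << k)
                holdings[neighbor] = snapshot[x]
                transmissions += 1
    return holdings, transmissions
```

For n=3 every one of the 8 nodes holds the message after 3 slots, with 12 scheduled transmissions, compared with the 7 that would be needed if only informed nodes transmitted.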

At the start of the hypercube decoding algorithm,
input device 60 loads a received channel sequence to be
decoded into a suitable storage device 69. As noted
earlier, the preferred embodiment of this invention is
achieved by VLSI techniques. Thus the storage device
69 may advantageously be an addressable storage space
in the processor memory. A sequence of processor computations
is then performed by the decoder. The input
sequence, stored in memory 69, is used to update both a
locally-stored accumulated metric and an accumulated
metric that has been received from a neighboring processor.
The local metric is stored in a local metric memory
70. That metric value is then updated in the update
metric device 71.

Meanwhile, an accumulated metric value from a
neighboring processor has been supplied by input device
60 to an accumulated metric storage 75 which is
used to store the neighbor’s accumulated metric value.
The metric at storage 75 is updated and made available
in the update metric unit 76. A suitable comparison
between those updated metric values is achieved by
comparator 72 and the proper metric value is retained
as a new local metric value. Note that comparator 72
supplies that new local metric value both to the local
metric memory 70 and to the output device 65.

A comparable operation takes place for the survivor 
values. Thus, input device 60 receives the survivor 
value from the current neighbor, i.e., the neighbor 
along the dimension currently in use. That survivor 
value is stored at 80 and updated at 81. The survivor is
simply updated by a table look-up method performed in 
accordance with the well-known Viterbi algorithm. 
For this reason no control lead is shown for update 
circuits 81 and 86. 

An updated local survivor and an updated neighbor
survivor are supplied to a comparator 90 in order to
choose the most likely path. The result of the comparison
under control of the selection command from comparison
circuit 72 is used to select either the updated
local survivor or the updated neighbor survivor in 90.
The selected quantities (metric and survivor) are thereafter
fed back and stored in the processor as the next
upcoming local quantities. Note that both are also available
to be sent, via output 65, to a neighbor processor
along the appropriate dimension of the cube.

The decoded sequence is formed from the survivors 
as in the known Viterbi algorithm and is stored in a 
decoded register sequence 95. The decoded sequence 
output from storage 95 is delivered to a user. Note that 
input and output leads are labeled at the origin node in
FIG. 3. 

In FIG. 6a the trellis diagram is as that shown earlier
in FIG. 2 and thus needs no further description. FIG. 6b
describes the implementation of the same decoder of
FIG. 6a, but with two states per processor (S=2).
Note that double lines mean that an exchange of two
metrics and two survivors along each bidirectional link
between processors is required. Similarly, in FIG. 6c,
each processor performs all the operations on a set of
four states (S=4).

When more than one state is assigned to each processor,
the invention can be applied to the decoding of a
more general class of codes having rate $k_0/n_0$ where
$k_0 > 1$. FIG. 7 shows a structure for decoding an 8-state
code with $k_0 = 2$ on a two-dimensional cube with four
processors. Again, all the required links in the trellis of
FIG. 7a can be implemented on a 2-cube as in FIG. 7b.

FIG. 8 is an example showing how various paths are 
eliminated during decoding. The solid line in FIGS. 8a 
and 8b is the path which has been chosen. 

Consider, as an example, a decoder for a 4-state rate
one-half convolutional code given by the generator
polynomials $g_1 = 111$ and $g_2 = 101$. A received
sequence for such a code is given in FIG. 8b.
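
For reference, an encoder for this 4-state, rate one-half code can be sketched in Python as follows. The function names are hypothetical, and the register layout (current input bit in the most significant position, register shifting right) is an assumption. Because both generators read the same forwards and backwards, the output does not depend on which end of the register the taps are counted from.

```python
G1, G2 = 0b111, 0b101    # generator polynomials from the text
M_BITS = 2               # memory m, so 2**m = 4 states


def parity(v: int) -> int:
    """Modulo-2 sum of the bits of v."""
    return bin(v).count("1") & 1


def encode(bits, state: int = 0):
    """Return the (c1, c2) channel-symbol pair for each input bit."""
    out = []
    for b in bits:
        reg = (b << M_BITS) | state      # [current bit, two previous bits]
        out.append((parity(reg & G1), parity(reg & G2)))
        state = reg >> 1                 # shift right; oldest bit drops out
    return out
```

Starting from the all-zero state, `encode([1, 0, 1, 1])` produces the symbol pairs 11, 10, 00, 01.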

A conventional decoder searches for the maximum- 
likelihood path on the graph of FIG. 8a where, for a 
given received sequence, all survivor paths considered 
are shown. The decoder of my invention operates as 
shown in FIG. 8b, where the same survivor paths are
shown in terms of transitions between processors. The
decoded sequence is obviously the same in both cases,
but the transitions in FIG. 8b involve only neighboring
processors on a hypercube. 

While the invention has been particularly shown and 
described with reference to preferred embodiments 
thereof, it will be understood by those skilled in the art 
that various changes in form and details may be made 
therein without departing from the spirit and scope of 
the invention. 

What is claimed is: 

1. A method for maximum-likelihood decoding of 
convolutional codes for a received code sequence having
$2^m$ states (where m is a whole integer) by a plurality
of interconnected processors each of which is normally 
in communication with all the other processors of said 
plurality and each provided with a decoding algorithm 
means for deriving/interchanging decoding parameters 
including branch metrics, accumulated metrics and 
survivors in an m-state trellis with each processor 
fixedly assigned to one state of said trellis, the improve- 
ment comprising: 

connecting each one of a plurality of $2^m$ processors,
equipped with said decoding algorithm means, in 
an n-cube configuration having bidirectional com- 
munication links along the edges only of said cube 
and certain processors thereof not having a direct 
communication link between other processors of 
said n-cube configuration; 

mapping an equivalent trellis for said n-cube configu- 
ration wherein processors represent more than one 
state and thus have direct communication links 
between other processors of said n-cube configura- 
tion on said equivalent trellis; and 

deriving/interchanging the branch and accumulated 
metrics between the processors which represent all 
of the states in said equivalent trellis in order to 
select, by said decoding algorithm means, the maximum
likelihood path from said equivalent trellis.

2. A method according to claim 1, and wherein said
deriving/interchanging step is further characterized by:

communicating only between adjacent neighboring
processors in said n-cube configuration wherein
each processor represents different states of the
decoder at different stages on said equivalent trellis,
according to the relation:

$$\hat{x} = \rho^{(k)}(x),$$

meaning that processor x in a binary notation represents
state $\hat{x}$ at stage k, where $\rho^{(k)}(x)$ is the cyclic
right shift of x by k binary positions.

3. A method according to claim 1 comprising the 
additional preliminary steps prior to said deriving/inter- 
changing step of: 

designating one processor of the n-cube network as
an origin/output processor; and 

broadcasting said received signals from said origin 
processor to all of the remaining n-cube connected 
processors. 

4. A maximum likelihood decoding system for determining
accumulated metrics and selecting survivor
paths during a repetitive set of stages of a trellis code
having a finite number of states M, (where M is a whole
integer), said system comprising:

N processors ($N = 2^n$) placed at the vertices of an
n-dimensional cube, where n is a whole integer,
with each processor assigned to represent more
than one state at different stages, k, in a repeating
set of stages in the trellis code, that is, each processor
x is assigned state $\hat{x}$ according to the relation:

$$\hat{x} = \rho^{(k)}(x),$$

meaning that processor x represents state $\hat{x}$ at stage
k, where $\rho^{(k)}(x)$ is the cyclic right shift of x by k
binary positions and k is the number of the stages which
make up a set in the trellis;

means for computing at each processor accumulated
metrics and survivors at said different states of the
trellis code at correspondingly different stages
numbered (k=0, k=1, k=2, etc.) on the trellis; and

bidirectional processor-communication links along
the edges only of said n-dimensional cube for exchanging
said computed accumulated metrics and
survivors only between adjacent neighboring processors.

5. A decoding system in accordance with claim 4, 
wherein one processor from said plurality is designated
as an input processor and further comprising: 

input means at said input processor for receiving a 
coded line sequence to be broadcast to every one of 
said processors; and 

output means at said one processor for outputting a 
chosen maximum-likelihood sequence as a decoded 
output from said decoding system. 

6. A maximum-likelihood decoding method for decoding
sequences generated by a convolutional encoding
having $M = 2^m$ states by considering at the decoder
all possible paths on an n-cube trellis and finding the
most likely path according to a specified goodness criterion
involving decoding parameters, said method comprising
the steps of:

forming a concurrent processor network of $N = 2^n$
microprocessors interconnected as an n-dimensional
cube having one microprocessor each at the
vertices of said cube and bidirectional communication
links for said microprocessors only along the
edges of said cube and corresponding to all paths to
be checked on said n-cube trellis by transmitting/receiving
said decoding parameters over said
communication links;

broadcasting a sequence to be decoded to every one
of said microprocessors;

assigning each of the M states on said n-cube trellis to
each of the N microprocessors according to the
formula:

$$\hat{x} = \rho^{(k)}(x)$$

wherein, when expressed in binary notations, microprocessor
x represents state $\hat{x}$ at stage k and the
function $\rho^{(k)}$ is the cyclic right shift of x by k binary
positions and M, m, N and n are whole integers;

computing and storing locally at each microprocessor
the decoding parameters; and

transmitting/receiving said decoding parameters between
neighboring processors in said n-dimensional
cube.

7. A decoding method for determining the maximum-likelihood
path of a trellis code having $M = 2^m$ states
representing a line sequence generated by a convolutional
encoder and subjected to channel noise, said
method involving known decoding algorithms having
decoding parameters to be transmitted/received between
N ($N = 2^n$) decoding processors, with M, N and n
being whole integers comprising the steps of:

dividing a decoding algorithm for said processors
into equal parts as a power of two from all of said
states for processing of each of said equal parts by
one each of a plurality of N concurrent processors;

locating $N = 2^n$ processors at the vertices of an
n-dimensional cube having bidirectional communication
paths for said processors only along the edges
of said cube;

mapping a trellis with certain processors representing
more than one state on the trellis in order to represent
all of the M states on said trellis by said N
processors, with each of said processors having
direct transmitting/receiving paths from neighboring
processors along the dimensions of said cube
and in said trellis; and

computing the maximum-likelihood path for each of
the M states by each of the n microprocessors said
trellis wherein said processors represent more than
one state in an order according to:

$$\hat{x} = \rho^{(k)}(x)$$

wherein microprocessor x represents a set of states
$\hat{x}$ at stages k and the function $\rho^{(k)}$ is the cyclic right
shift of x by k binary positions.

8. A maximum likelihood decoding system for determining
accumulated metrics and selecting survivor
paths of M states of a trellis code in a network of $N = 2^n$
processors with each processor placed at the vertices of
an n-dimensional cube and representing a set of $S = 2^s$
states of the decoder formed from said processors; said
decoder characterized in that: each processor represents
different states of the trellis code at different stages
on the trellis, according to the relation:

$$\hat{x} = \rho^{(k)}(x),$$

meaning that processor x represents state $\hat{x}$ at stage
k, where $\rho^{(k)}(x)$ is the cyclic right shift of x by k
binary positions.

9. A maximum-likelihood decoding system in accor- 
dance with claim 8 and further comprising: 
bidirectional processor-communication links along
the edges only of said n-dimensional cube for ex- 
changing accumulated metrics and survivors. 

10. A decoding system in accordance with claim 9
wherein each of said processors is characterized by
means at each processor for computing its local accumulated
metrics and for updating the locally-accumulated
metrics by adding thereto branch metrics,
and said decoding system further comprises:

comparing means in each processor for comparing
the local accumulated metrics with those accumulated
metrics received over said links from said
neighboring processors.

11. A decoding system in accordance with claim 10 
and further comprising: 

an origin processor at one vertex only of said n- 
dimensional cube; and 

means for outputting to a user a decoded signal from 
said origin processor. 